Automatic language identification using long short-term memory recurrent neural networks
Authors
Abstract
This work explores the use of Long Short-Term Memory (LSTM) recurrent neural networks (RNNs) for automatic language identification (LID). The use of RNNs is motivated by their greater ability to model sequences compared with the feed-forward networks used in previous works. We show that LSTM RNNs can effectively exploit temporal dependencies in acoustic data, learning features relevant to language discrimination. The proposed approach is compared to baseline i-vector and feed-forward Deep Neural Network (DNN) systems on the NIST Language Recognition Evaluation 2009 dataset. We show that LSTM RNNs achieve better performance than our best DNN system with an order of magnitude fewer parameters. Further, combining the different systems leads to significant performance improvements (up to 28%).
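To make the approach concrete, the following is a minimal sketch of an LSTM-based LID scorer: a single LSTM cell run over a sequence of acoustic feature frames, with a softmax over languages computed from the final hidden state. This is an illustration only, not the paper's architecture; the function name, feature dimensionality, hidden size, and number of languages are all assumptions for the example.

```python
import numpy as np

def lstm_lid_forward(frames, params):
    """Illustrative LSTM forward pass for language ID (hypothetical helper,
    not the paper's exact model). Consumes acoustic frames sequentially and
    returns language posterior probabilities from the last hidden state."""
    Wx, Wh, b, Wo, bo = params          # gate input weights, recurrent weights, biases
    H = Wh.shape[1]                     # hidden size
    h = np.zeros(H)                     # hidden state
    c = np.zeros(H)                     # cell state
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    for x in frames:                    # one acoustic feature frame per time step
        z = Wx @ x + Wh @ h + b         # all four gate pre-activations at once
        i, f, o, g = np.split(z, 4)     # input, forget, output gates; candidate
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)   # cell state update
        h = sigmoid(o) * np.tanh(c)                    # new hidden state
    logits = Wo @ h + bo                # per-language scores
    e = np.exp(logits - logits.max())   # numerically stable softmax
    return e / e.sum()

# Toy usage: 39-dim features (e.g. MFCCs + deltas), 8 hidden units, 3 languages.
rng = np.random.default_rng(0)
D, H, L = 39, 8, 3
params = (rng.standard_normal((4 * H, D)) * 0.1,
          rng.standard_normal((4 * H, H)) * 0.1,
          np.zeros(4 * H),
          rng.standard_normal((L, H)) * 0.1,
          np.zeros(L))
frames = rng.standard_normal((50, D))   # 50 frames ~ 0.5 s at a 10 ms frame shift
post = lstm_lid_forward(frames, params)
```

The last-hidden-state readout is one simple pooling choice; frame-level posteriors averaged over time are another common option for LID systems.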
Similar resources
Language Identification in Short Utterances Using Long Short-Term Memory (LSTM) Recurrent Neural Networks
Long Short-Term Memory (LSTM) Recurrent Neural Networks (RNNs) have recently outperformed other state-of-the-art approaches, such as i-vectors and Deep Neural Networks (DNNs), in automatic Language Identification (LID), particularly when dealing with very short utterances (∼3 s). In this contribution we present an open-source, end-to-end, LSTM RNN system running on limited computational resources...
End-to-End Language Identification Using Attention-Based Recurrent Neural Networks
This paper proposes a novel attention-based recurrent neural network (RNN) to build an end-to-end automatic language identification (LID) system. Inspired by the success of attention mechanism on a range of sequence-to-sequence tasks, this work introduces the attention mechanism with long short term memory (LSTM) encoder to the sequence-to-tag LID task. This unified architecture extends the end...
A Step Beyond Local Observations with a Dialog Aware Bidirectional GRU Network for Spoken Language Understanding
Architectures of Recurrent Neural Networks (RNNs) have recently become a very popular choice for Spoken Language Understanding (SLU) problems; however, they represent a large family of different architectures that can furthermore be combined to form more complex neural networks. In this work, we compare different recurrent networks, such as simple Recurrent Neural Networks (RNNs), Long Short-Term Memory...
Articulatory movement prediction using deep bidirectional long short-term memory based recurrent neural networks and word/phone embeddings
Automatic prediction of articulatory movements from speech or text can be beneficial for many applications such as speech recognition and synthesis. A recent approach has reported state-of-the-art performance in speech-to-articulatory prediction using feed-forward neural networks. In this paper, we investigate the feasibility of using bidirectional long short-term memory based recurrent neural n...
Sequence Modeling with Recurrent Tensor Networks
We introduce the recurrent tensor network, a recurrent neural network model that replaces the matrix-vector multiplications of a standard recurrent neural network with bilinear tensor products. We compare its performance against networks that employ long short-term memory (LSTM) networks. Our results demonstrate that using tensors to capture the interactions between network inputs and history c...